Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition

Authors

  • Hamed Karimi
  • Julie Nutini
  • Mark W. Schmidt
Abstract

The Polyak-Łojasiewicz (PL) condition:

  • admits a simple proof of linear convergence of gradient descent;
  • for convex functions, is equivalent to several other standard conditions used in linear-convergence analyses;
  • for non-convex functions, is the weakest such assumption that still guarantees convergence to a global minimizer.

We generalize the PL condition to analyze proximal-gradient methods, and we give simple new analyses in a variety of settings:

  • least-squares and logistic regression;
  • randomized coordinate descent;
  • greedy coordinate descent and variants of boosting;
  • stochastic gradient descent (diminishing or constant step size);
  • stochastic variance-reduced gradient (SVRG);
  • proximal-gradient methods and LASSO;
  • coordinate minimization with a separable non-smooth term (bound constraints or L1-regularization);
  • linear convergence of SVM training with SDCA.
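The argument behind the linear rate can be sketched in a few lines. Below is a standard statement (in LaTeX) of the PL inequality and the resulting rate for gradient descent with step size 1/L, assuming f has an L-Lipschitz gradient; the notation and constants are the usual ones, not quoted verbatim from the paper.

    % PL inequality: there exists \mu > 0 such that for all x,
    %   \tfrac{1}{2}\|\nabla f(x)\|^2 \ge \mu\,\bigl(f(x) - f^*\bigr).
    % Combining it with the descent lemma for the step x_{k+1} = x_k - \tfrac{1}{L}\nabla f(x_k):
    \begin{align*}
    f(x_{k+1}) &\le f(x_k) - \tfrac{1}{2L}\|\nabla f(x_k)\|^2
                \le f(x_k) - \tfrac{\mu}{L}\bigl(f(x_k) - f^*\bigr), \\
    f(x_k) - f^* &\le \Bigl(1 - \tfrac{\mu}{L}\Bigr)^{k}\bigl(f(x_0) - f^*\bigr).
    \end{align*}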

Related articles

Linear Convergence of Proximal-Gradient Methods under the Polyak-Łojasiewicz Condition

In 1963, Polyak proposed a simple condition that is sufficient to show that gradient descent has a global linear convergence rate. This condition is a special case of the Łojasiewicz inequality proposed in the same year, and it does not require strong convexity (or even convexity). In this work, we show that this much-older Polyak-Łojasiewicz (PL) inequality is actually weaker than the four mai...
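As a concrete illustration of the claim that strong convexity is not needed, the Python sketch below runs gradient descent with step size 1/L on a rank-deficient least-squares problem, which satisfies the PL inequality but is not strongly convex. The matrix sizes, seed, and iteration counts are arbitrary choices for illustration, not taken from the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    # Rank-deficient design matrix: f(x) = 0.5*||Ax - b||^2 is PL but not strongly convex.
    A = rng.standard_normal((50, 20)) @ np.diag(np.r_[np.ones(10), np.zeros(10)])
    b = rng.standard_normal(50)

    L = np.linalg.norm(A, 2) ** 2                  # Lipschitz constant of the gradient
    f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
    f_star = f(np.linalg.pinv(A) @ b)              # optimal value

    x = np.zeros(20)
    for k in range(201):
        if k % 50 == 0:
            print(k, f(x) - f_star)                # gap shrinks geometrically
        x -= (1.0 / L) * (A.T @ (A @ x - b))       # gradient step with step size 1/L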

A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization

We analyze stochastic gradient algorithms for optimizing nonconvex, nonsmooth finite-sum problems. In particular, the objective function is given by the sum of a differentiable (possibly nonconvex) component and a possibly non-differentiable but convex component. We propose a proximal stochastic gradient algorithm based on variance reduction, called ProxSVRG+. The algorithm is ...
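To make this class of method concrete, here is a minimal Python sketch of the general prox-SVRG template: an SVRG-style variance-reduced gradient estimate followed by a proximal step for an L1 term. The function names, parameters, and update schedule are illustrative assumptions, not a faithful implementation of ProxSVRG+ itself.

    import numpy as np

    def soft_threshold(v, t):
        # Proximal operator of t*||.||_1 (soft-thresholding).
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def prox_svrg(grad_i, n, x0, lam, eta, epochs=10, inner=100, seed=0):
        # Sketch for min_x (1/n) * sum_i f_i(x) + lam*||x||_1, where
        # grad_i(i, x) returns the gradient of the i-th smooth component at x.
        rng = np.random.default_rng(seed)
        x_tilde = x0.copy()
        for _ in range(epochs):
            full_grad = np.mean([grad_i(i, x_tilde) for i in range(n)], axis=0)
            x = x_tilde.copy()
            for _ in range(inner):
                i = rng.integers(n)
                # Variance-reduced stochastic gradient estimate.
                g = grad_i(i, x) - grad_i(i, x_tilde) + full_grad
                # Proximal step handles the non-smooth L1 term.
                x = soft_threshold(x - eta * g, eta * lam)
            x_tilde = x
        return x_tilde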

Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition

In 1963, Polyak proposed a simple condition that is sufficient to show a global linear convergence rate for gradient descent. This condition is a special case of the Łojasiewicz inequality proposed in the same year, and it does not require strong convexity (or even convexity). In this work, we show that this much-older Polyak-Łojasiewicz (PL) inequality is actually weaker than the main condition...

Extensions of the Hestenes-Stiefel and Polak-Ribiere-Polyak conjugate gradient methods with sufficient descent property

Using search directions of a recent class of three-term conjugate gradient methods, modified versions of the Hestenes-Stiefel and Polak-Ribiere-Polyak methods are proposed which satisfy the sufficient descent condition. The methods are shown to be globally convergent when the line search fulfills the (strong) Wolfe conditions. Numerical experiments are done on a set of CUTEr unconstrained opti...
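For reference, the classical Hestenes-Stiefel and Polak-Ribiere-Polyak update parameters and the sufficient descent condition are stated below (in LaTeX). The three-term modifications proposed in the paper differ from these standard two-term forms, which by themselves do not guarantee sufficient descent under inexact line searches.

    % With g_k = \nabla f(x_k), y_{k-1} = g_k - g_{k-1}, and two-term directions
    %   d_0 = -g_0, \quad d_k = -g_k + \beta_k d_{k-1},
    % the classical update parameters and the sufficient descent condition (c > 0) are
    \[
    \beta_k^{HS} = \frac{g_k^\top y_{k-1}}{d_{k-1}^\top y_{k-1}},
    \qquad
    \beta_k^{PRP} = \frac{g_k^\top y_{k-1}}{\|g_{k-1}\|^2},
    \qquad
    g_k^\top d_k \le -c\,\|g_k\|^2 \ \ \text{for all } k.
    \]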

The Physical Systems Behind Optimization Algorithms

We use differential-equation-based approaches to provide physical insights into the dynamics of popular optimization algorithms in machine learning. In particular, we study gradient descent, proximal gradient descent, coordinate gradient descent, proximal coordinate gradient descent, and Newton's method, as well as their Nesterov-accelerated variants, in a unified framework motivated by...
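A brief note on the correspondence this line of work typically uses (whether it matches the paper's exact framework is not clear from the excerpt): gradient descent is the forward-Euler discretization of gradient flow, and Nesterov's accelerated method is commonly associated with a second-order ODE.

    % Gradient flow and its forward-Euler discretization with step size \eta:
    \[
    \dot{x}(t) = -\nabla f\bigl(x(t)\bigr),
    \qquad
    x_{k+1} = x_k - \eta\,\nabla f(x_k).
    \]
    % Second-order ODE commonly associated with Nesterov acceleration (Su, Boyd, and Candes):
    \[
    \ddot{x}(t) + \frac{3}{t}\,\dot{x}(t) + \nabla f\bigl(x(t)\bigr) = 0.
    \]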


Publication year: 2016